REMARKS

The above Amendments are offered to more clearly reference the formal drawings filed 
herewith. In particular, Figures 13E, 13F, and 13G were renumbered as Figures 14A, 14B, and 
14C, respectively. Accordingly, changes in certain text references are necessary. 

Attached hereto is a marked-up version of the changes made to the specification and 
claims by the current amendment. The attached page is captioned "Version with markings to 
show changes made." 

In view of the above, each of the presently pending claims in this application is believed 
to be in immediate condition for allowance. Accordingly, the Examiner is respectfully requested 
to pass this application to issue. 

Applicant believes no fee is due with this response. However, if a fee is due, please 
charge our Deposit Account No. 18-1945, under Order No. ANVI-PO 1-001 from which the 
undersigned is authorized to draw. 

Dated: July 17, 2002 Respectfully submitted, 
Version With Markings to Show Changes Made

[0001] This application claims the benefit of U.S. provisional patent application Serial No. 
60/285,385, filed April 20, 2001 [,]; U.S. provisional patent application Serial No. 60/285,945, 
filed April 23, 2001 [,]; U.S. provisional patent application Serial No. 60/322,771, filed 
September 17, 2001 [,]; and U.S. provisional application identified by Attorney Docket Code 
ANV-003PR, entitled Multi-Dimensional Interactive Data Visualization Applied To Small 
Molecule Research, filed January 15, 2002 (Serial No. 60/348,854), all of which applications are 
incorporated herein in their entirety by reference. 

[0002] This application is related to U.S. patent application identified by Attorney Docket Code 
ANV-002, entitled "Method And System For Data Analysis" (Serial No. 10/077,586) and to U.S. 
patent application identified by Attorney Docket Code ANV-004, and entitled "Method And 
System For Data Analysis" (Serial No. 10/077,692), both of which are filed on even date 
herewith and incorporated herein in their entirety by reference. 

[0077] Figure [13E] 14A is a GUI screen display depicting a table-like visualization of the data 
of Figure 13D according to an illustrative embodiment of the attribute reduction subsystem; 

[0078] Figure [13F] 14B is a GUI screen display depicting the table-like visualization of 
Figure [13E] 14A subsequent to sorting according to an illustrative embodiment of the attribute 
reduction subsystem; 

[0079] Figure [13G] 14C is a GUI screen display depicting a multiple line graph 
transformation of the data of Figure [13F] 14B according to an illustrative embodiment of the 
invention; 

Delete paragraph [0080] 

[0186] An extension of this technique is animating the display, where the attribute positions 
along the locus are consecutively shifted by a skip factor, such as one. That is, a fixed number of 
attributes (called a frame) are laid out on the locus. The number of attributes per frame is equal 
to the period of the time cycle data. The total number of attributes plotted typically includes 
several frame cycles worth. The radial visualization is then animated to show consecutive 
frames of data. Each individual display of the animation shows the same attributes, but with the 
attribute locations incremented by the skip factor. One advantage of this technique is that it can 
show data points that have unique time varying dependencies that are not seen in other 
visualizations. Some examples are discussed below with respect to Figures 13A-13[G]D and 
14A-14C. 
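
By way of illustration only, the frame-shifting scheme of paragraph [0186] may be sketched as follows. The sketch reflects one reading of the text, in which attributes one period apart share a position on the locus and every attribute advances by the skip factor at each animation step. The function names `slot_angles` and `attribute_slots` and the evenly spaced circular slots are assumptions introduced for this example and do not appear in the specification.

```python
import math


def slot_angles(frame_size):
    """Evenly spaced angles (radians) for the slots of one frame on a circular locus."""
    return [2 * math.pi * k / frame_size for k in range(frame_size)]


def attribute_slots(num_attributes, frame_size, step, skip=1):
    """Slot index of each attribute at a given animation step.

    Attribute i starts in slot i % frame_size, so attributes that are one
    period apart share a slot. Each animation step advances every attribute
    by `skip` slots, wrapping around the locus.
    """
    return [(i + step * skip) % frame_size for i in range(num_attributes)]


if __name__ == "__main__":
    # Two frame cycles of a period-3 data set, shown over three animation steps.
    for step in range(3):
        print(step, attribute_slots(6, 3, step))
```

Setting the frame size equal to the period of the time cycle data, as the paragraph describes, keeps in-phase attributes together in every frame of the animation.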

[0191] Figure [13E] 14A is a GUI display screen 1307 of a table-like display 1344 of the type 
generated by the attribute reduction subsystem 102 of the invention. In the table 1344, the time 
samples T1-T100 are shown along the right margin. Each column of the table 1344 represents 
one of a thousand genes. The binned shading represents the gene expression values at each of 
the one hundred time samples T1-T100. However, with the time samples clustered and the 
records sorted by T1, in accord with the methods discussed herein, ten groups 1346a-1346k of 
time intervals T1-T100 emerge. We can also see that the time samples T1, T11, T21, . . ., T91 
are in phase with each other, but ninety degrees out of phase with the time samples T6, T16, T26, 
. . ., T96. Thus, the table 1344 provides additional information regarding analysis of time varying 
dependencies. The sinusoidal nature of the time dependencies of the data set of Figures 13A- 
[13F] 13D and 14A-14B is further illustrated in the display 1311 of Figure [13G] 14C, which 
displays a multiple line graph representation of the data of Figure [13F] 14B. An illustrative 
process for such transformation is discussed above with respect to Figures 11A-11E. 

[0192] As described above, according to the illustrative embodiment, the record categorization 
subsystem 104 employs the AP layout algorithm to determine the attribute positioning to realize 
the category separations of Figure 12C. Details of the illustrative AP algorithm are described 
next [with respect to Figures 14A-14C]. 

Application No.: 10/077694                    Docket No.: ANVI-PO 1-001 

[0193] [Figure 14A is] In one embodiment a display screen of a radial visualization [1400 
showing] shows a 76 gene attribute subset [1402] of the Affymetrix™ gene set randomly 
arranged on the perimeter of a circular locus [1404]. The records (patients) [1406] are plotted 
within the locus [1404] in a manner such as described with respect to Figures 12A-12C. [The 
dark dots 1408 indicate patients known to have ALL-type leukemia, while the light gray dots 
1410 indicate patients known to have AML-type leukemia.] To test the 76 gene subset to 
determine if it is result-effective and/or to calibrate the radial visualization [1400] the illustrative 
record categorization subsystem 104 employs the AP algorithms. 

[0194] The AP algorithms use class distinction [1402] metrics to assign the positions of the 
attributes on the locus [1404]. In the illustrative embodiment, the metric employed is t-statistics. 
The t-statistic is calculated for each column (gene attribute [1402]) by comparing all of the ALL 
values with all of the AML values in each column. The t-statistic is a standard statistical test for 
comparing two groups using the means and standard deviations. The t-statistic for each attribute 
[1402] determines the order of the attributes [1402] around the perimeter of the locus [1404]. 

[0195] [Referring to Figure 14B, the]The genes or columns [1402] that have higher values for 
ALL are laid out in the top half [1412] of the locus [1404], the genes or columns [1402] that 
have higher values for AML are laid out in the bottom half [1414] of the locus [1404]. The order 
of the genes [1402] is by t-statistic value. In the top half [1412] of the locus [1404], the genes 
[1402] are ordered right to left with the most significant gene [1416] on the right and the least 
significant gene [1416] on the left. In the bottom half [1414] the genes [1402] are ordered with 
significance going from left [1420] to right [1422]. 

[0196] The columns (genes [1402]) are laid out around the locus [1404] perimeter with the 
column that has the highest t-statistic (negative) value at the gene [1416 in the diagram. Gene 
1416 is most significant for having] that has a higher mean for ALL than AML. One gene 
[Gene] or column [1420] is most significant for having higher mean values for AML than ALL. 
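
By way of illustration only, the attribute positioning of paragraphs [0194] through [0196] may be sketched as follows. The specification states only that the t-statistic compares the two groups using means and standard deviations, so the Welch form used here, the sign convention (positive when the ALL mean is higher), the even angular spacing, and the names `welch_t` and `ap_layout` are all assumptions made for this example. Angles are measured counterclockwise from the right-hand side of the locus, so the top half runs right to left and the bottom half runs left to right.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t-statistic; positive when mean(a) > mean(b).

    Each group needs at least two values and the groups must not both
    have zero variance.
    """
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def ap_layout(all_rows, aml_rows, gene_names):
    """Return (t_stats, angles) for the gene columns of two patient groups.

    Genes with a higher ALL mean go in the top half (0 to pi), most
    significant at the right. Genes with a higher AML mean go in the bottom
    half (pi to 2*pi), with significance increasing from left to right.
    """
    t_stats = {}
    for j, gene in enumerate(gene_names):
        t_stats[gene] = welch_t([r[j] for r in all_rows], [r[j] for r in aml_rows])

    # Descending t in both halves: most positive first, most negative last.
    top = sorted((g for g in gene_names if t_stats[g] >= 0), key=lambda g: -t_stats[g])
    bottom = sorted((g for g in gene_names if t_stats[g] < 0), key=lambda g: -t_stats[g])

    angles = {}
    for k, gene in enumerate(top):
        angles[gene] = math.pi * (k + 0.5) / len(top)
    for k, gene in enumerate(bottom):
        angles[gene] = math.pi + math.pi * (k + 0.5) / len(bottom)
    return t_stats, angles


if __name__ == "__main__":
    # Made-up expression values for three hypothetical genes, three patients per class.
    all_rows = [[5.0, 1.0, 3.0], [6.0, 2.0, 3.5], [7.0, 1.5, 2.5]]
    aml_rows = [[1.0, 5.0, 3.1], [2.0, 6.0, 2.9], [1.5, 7.0, 3.3]]
    t_stats, angles = ap_layout(all_rows, aml_rows, ["g1", "g2", "g3"])
    for gene in ("g1", "g2", "g3"):
        print(gene, round(t_stats[gene], 2), round(math.degrees(angles[gene]), 1))
```

Under this reading the most significant ALL gene and the most significant AML gene both sit toward the right-hand side of the locus, on opposite halves, which tends to pull the two patient classes apart vertically.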

[0197] [As can be seen in Figure 14B, use]Use of the AP algorithms results in a relatively clean 
separation between the patients [1408] having AML-type leukemia and the patients [1410] 
having ALL-type leukemia. 

[0198] Since the illustrative AP algorithm described above ranks the significance of the 
attributes [1402], the operator may also employ the AP algorithms for attribute reduction. More 
specifically, subsets of the most significant attributes [1402] may be examined to determine 
further reduced, result-effective attribute subsets. By way of example, [Figure 14C is a screen 
shot of] a radial visualization [1424] employing the top five most significant genes for ALL 
[1426] and AML [1428. As can be seen] shows that using this attribute subset, the AML-type 
patients [1408] and the ALL-type patients [1410] continue to clearly divide. Thus, the AP 
algorithms employed by illustrative record categorization subsystem 104 not only provide record 
categorization features, but also attribute reduction features. 
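
By way of illustration only, the attribute reduction of paragraph [0198] may be sketched as selecting the most significant attributes for each class from a set of per-attribute t-statistics. The function name `top_attributes`, the sign convention (positive values favoring ALL, negative values favoring AML), and the sample values are assumptions made for this example.

```python
def top_attributes(t_stats, k=5):
    """Pick the k most significant attributes for each class.

    `t_stats` maps attribute name to t-statistic. Returns two lists, each
    ordered most significant first: those with the largest positive values
    and those with the most negative values.
    """
    ranked = sorted(t_stats, key=t_stats.get)  # ascending by t-statistic
    most_negative = [g for g in ranked[:k] if t_stats[g] < 0]
    most_positive = [g for g in reversed(ranked[-k:]) if t_stats[g] > 0]
    return most_positive, most_negative


if __name__ == "__main__":
    # Made-up t-statistics for five hypothetical attributes.
    sample = {"a": 3.0, "b": -2.0, "c": 0.5, "d": -4.0, "e": 1.2}
    print(top_attributes(sample, k=2))
```

The reduced subset can then be laid out on the locus again to check whether the two classes continue to separate.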

[0228] Figure 24 is a GUI screen image 2400 depicting a multi-dimensional polygonal 
visualization 2402 in the display panel 1706 according to an illustrative embodiment of the 
invention. The polygonal visualization 2402 includes a number of records 2408 disposed at 
locations determined in relation to a plurality of attributes 2404 by way of the methodology 
discussed above[ with respect to Figures 14A-15C]. The attributes on the locus of the circle are 
extended to form lines. This line now represents the attribute with the minimum value at one end 
of the line and the maximum value at the other. Thus, it is an axis and this yields a polygonal 
display. Each record in the display has a value for that attribute and the line from the attribute 
value points to the record. In many cases the values for each attribute have a distribution which 
is represented on the attribute line, thus yielding multiple lines pointing to the records. This is 
similar to parallel coordinates for which the lines represent the axes. In Figure 24, the control 
panel 1708 includes a button 2404 which enables an operator to activate global parameters 
during the display of the polygonal visualization 2402, and slider controls 2410a and 2410b 
which control the resolution of data in the X and Y directions, respectively. The control panel 
1708 of Figure 24 also includes a plurality of check boxes 2412a-2412f. The check boxes 
2412a-2412f control whether a floating probe is displayed, and if so, the features of the 
information displayed using the floating probe. The floating probe displays actual 
attribute values. The control panel 1708 further includes a pull-down menu 2414 which selects a 
region mode. The region mode menu 2414 enables an operator to select a region of the 
visualization 2402 for display and/or analysis by way of a pointing device, such as a mouse. The 
control panel 1708 also provides a series of user interactive dialog boxes 2416a-2416e for 
manipulating the forces applied to the records 2408 during plotting on the locus 2406. An 
operator enters a desired force equation in the dialog box 2418. To enter a force equation into 
any of the dialog boxes 2416a-2416e, the operator enters the force equation in the dialog box 
2418 and then selects one or more dialog boxes 2416a-2416e to indicate to which of the 
attributes the entered force equation is to be applied. 
[0238] Figure 32 is a GUI screen image 3200 illustrating the "Data" pull-down menu 1716f. In 
Figure 32, the "Data" 1716f menu has been activated. The resulting pull-down menu 3200 
includes entries "Do All Sort" 3202a, "Sum Sort of Records" 3202b, "Show Table . . ." 3202c, 
"Set Missing Values" 3202d, and "Pivot Data" 3202e. The use of the pull-down menu 
commands 3202a-3202e follows the conventional method of highlighting a command with a 
mouse or other pointing device and activating the highlighted command with an action such as a 
mouse button click. The "Sort" commands 3202a and 3202b are used to sort by rows or columns 
in the gray scale binned table 3204. The "Show Table . . ." command 3202c displays the 
numerical data corresponding to entries in the table 3204. The "Set Missing Values" command 
3202d inserts missing values according to the condition assigned by the operator for missing 
values, as discussed above, or in the absence of such action by the operator, inserting default 
values. The "Pivot Data" command 3202e causes the exchange of rows and columns in the table 
3204. 

[0279] During practice of the invention, it has been discovered that various other subgroups of 
the 6817 genes, the expression products of which were tested in Golub et al (1999) supra, can 
be used to identify and distinguish individuals with AML, B ALL and T ALL. Three classes of 
genes comprising 76 genes, 57 genes and 3 genes were identified using different forms of the 
algorithms described herein. For example, 76 gene products have been identified using the 
methods and systems described herein which can be used to identify AML patients that respond 
differently to treatment regimes (see, Figure 34). Figure 34 shows criteria for distinguishing 
between individuals 3402 with AML that respond to chemotherapy from those 3404 that do not 
respond to chemotherapy. The 76 genes are identified in Table 1 below together with their 
GenBank accession numbers, the sequences of which are incorporated herein by reference. The 
sequences can be obtained through the National Center for Biotechnology Information (NCBI) 
web site at www.ncbi.nlm.nih.gov. 

[0301] Figure 49 depicts a GUI screen image 4900 of parameters for the AP algorithm, as 
described above [ with respect to Figures 14A-14C]. The GUI screen image 4900 shows a "Set 
Discrimination Threshold" dialog box 4902 that enables the selection of parameters for class 
distinction. The "Set Discrimination Threshold" dialog box 4902 enables the selection of GS, 
option 1, and option 2, and the selection of a positive differential selection or a negative 
differential selection. The GS, option 1, and option 2 select differential statistical measures for 
laying out the attributes. Further, a significance level is employed upon the selection of the "Use 
Significance Level" checkbox 4904. Moreover, the dialog box 4902 enables an input of a 
threshold value 4906 and/or a maximum class size 4908. 